Introduction to Gadfly
In [4]:
using Pkg
Pkg.add("RDatasets")
Updating registry at `~/.julia/registries/General`
Resolving package versions...
Installed Libffi_jll ───────── v3.2.2+1
Installed Mocking ──────────── v0.7.3
Installed PooledArrays ─────── v1.3.0
Installed ExprTools ────────── v0.1.6
Installed InlineStrings ────── v1.0.1
Installed CodecZlib ────────── v0.7.0
Installed TranscodingStreams ─ v0.9.6
Installed TimeZones ────────── v1.6.2
Installed FilePathsBase ────── v0.9.15
Installed WeakRefStrings ───── v1.4.1
Installed DataFrames ───────── v1.2.2
Installed RData ────────────── v0.8.3
Installed FileIO ───────────── v1.11.2
Installed CSV ──────────────── v0.9.11
Installed RDatasets ────────── v0.7.6
Updating `~/NotebooksForDataScience/docs/julia/Project.toml`
[ce6b1742] + RDatasets v0.7.6
Updating `~/NotebooksForDataScience/docs/julia/Manifest.toml`
[336ed68f] + CSV v0.9.11
[324d7699] + CategoricalArrays v0.10.2
[944b1d66] + CodecZlib v0.7.0
[a93c6f00] + DataFrames v1.2.2
[e2ba6199] + ExprTools v0.1.6
[5789e2e9] + FileIO v1.11.2
[48062228] + FilePathsBase v0.9.15
[842dd82b] + InlineStrings v1.0.1
[78c3b35d] + Mocking v0.7.3
[2dfb63ee] + PooledArrays v1.3.0
[df47a6cb] + RData v0.8.3
[ce6b1742] + RDatasets v0.7.6
[f269a46b] + TimeZones v1.6.2
[3bb67fe8] + TranscodingStreams v0.9.6
[ea10d353] + WeakRefStrings v1.4.1
[e9f186c6] ↑ Libffi_jll v3.2.2+0 ⇒ v3.2.2+1
Building TimeZones → `~/.julia/scratchspaces/44cfe95a-1eb2-52ea-b672-e2afdf69b78f/8de32288505b7db196f36d27d7236464ef50dba1/build.log`
Precompiling project...
✓ TranscodingStreams
✓ Libffi_jll
✓ PooledArrays
✓ FilePathsBase
✓ CodecZlib
✓ WeakRefStrings
✓ FileIO
✓ Wayland_jll
✓ Glib_jll
✓ Wayland_protocols_jll
✓ Cairo_jll
✓ xkbcommon_jll
✓ CSV
✓ HarfBuzz_jll
✓ Qt5Base_jll
✓ libass_jll
✓ FFMPEG_jll
✓ FFMPEG
✓ GR_jll
✓ GR
✓ DataFrames
✓ RData
✓ RDatasets
✓ Plots
✓ StatsPlots
25 dependencies successfully precompiled in 105 seconds (214 already precompiled)
Preparing Data
In [5]:
using Gadfly, RDatasets
iris = dataset("datasets", "iris")
Out[5]:
150 rows × 5 columns
| | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
| | Float64 | Float64 | Float64 | Float64 | Cat… |
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |
| 7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 8 | 5.0 | 3.4 | 1.5 | 0.2 | setosa |
| 9 | 4.4 | 2.9 | 1.4 | 0.2 | setosa |
| 10 | 4.9 | 3.1 | 1.5 | 0.1 | setosa |
| 11 | 5.4 | 3.7 | 1.5 | 0.2 | setosa |
| 12 | 4.8 | 3.4 | 1.6 | 0.2 | setosa |
| 13 | 4.8 | 3.0 | 1.4 | 0.1 | setosa |
| 14 | 4.3 | 3.0 | 1.1 | 0.1 | setosa |
| 15 | 5.8 | 4.0 | 1.2 | 0.2 | setosa |
| 16 | 5.7 | 4.4 | 1.5 | 0.4 | setosa |
| 17 | 5.4 | 3.9 | 1.3 | 0.4 | setosa |
| 18 | 5.1 | 3.5 | 1.4 | 0.3 | setosa |
| 19 | 5.7 | 3.8 | 1.7 | 0.3 | setosa |
| 20 | 5.1 | 3.8 | 1.5 | 0.3 | setosa |
| 21 | 5.4 | 3.4 | 1.7 | 0.2 | setosa |
| 22 | 5.1 | 3.7 | 1.5 | 0.4 | setosa |
| 23 | 4.6 | 3.6 | 1.0 | 0.2 | setosa |
| 24 | 5.1 | 3.3 | 1.7 | 0.5 | setosa |
| 25 | 4.8 | 3.4 | 1.9 | 0.2 | setosa |
| 26 | 5.0 | 3.0 | 1.6 | 0.2 | setosa |
| 27 | 5.0 | 3.4 | 1.6 | 0.4 | setosa |
| 28 | 5.2 | 3.5 | 1.5 | 0.2 | setosa |
| 29 | 5.2 | 3.4 | 1.4 | 0.2 | setosa |
| 30 | 4.7 | 3.2 | 1.6 | 0.2 | setosa |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
In [6]:
p = plot(iris, x=:SepalLength, y=:SepalWidth, Geom.point)
Out[6]:
In [7]:
img = SVG("iris_plot.svg", 14cm, 8cm)
draw(img, p)
Out[7]:
false
In [8]:
plot(iris, x=:SepalLength, y=:SepalWidth)
Out[8]:
In [9]:
# Build a point plot and a line plot of the same data and return both as a tuple.
function get_to_it(d)
    ppoint = plot(d, x=:SepalLength, y=:SepalWidth, Geom.point)
    pline = plot(d, x=:SepalLength, y=:SepalWidth, Geom.line)
    ppoint, pline
end
ps = get_to_it(iris)
map(display, ps)
Out[9]:
(nothing, nothing)
In [10]:
plot(iris, x=:SepalLength, y=:SepalWidth, Geom.point, Geom.line)
Out[10]:
In [11]:
SepalLength = iris.SepalLength
SepalWidth = iris.SepalWidth
plot(x=SepalLength, y=SepalWidth, Geom.point, Guide.xlabel("SepalLength"), Guide.ylabel("SepalWidth"))
Out[11]:
In [12]:
plot(iris, x=:SepalLength, y=:SepalWidth, color=:Species, Geom.point);
# or equivalently for Arrays:
Color = iris.Species
plot(x=SepalLength, y=SepalWidth, color=Color, Geom.point,
    Guide.xlabel("SepalLength"), Guide.ylabel("SepalWidth"),
    Guide.colorkey(title="Species"))
Out[12]:
In [15]:
y1 = [0.1, 0.26, NaN, 0.5, 0.4, NaN, 0.48, 0.58, 0.83]
plot(x=1:9, y=y1, Geom.line, Geom.point,
    color=["Item 1"], linestyle=[:dash], size=[3pt],
    layer(x=1:10, y=rand(10), Geom.line, Geom.point,
        color=["Item 2"], size=[5pt], shape=[Shape.square]),
    layer(x=1:10, y=rand(10), color=[colorant"hotpink"],
        linestyle=[[8pt, 3pt, 2pt, 3pt]], Geom.line))
Out[15]:
In [16]:
set_default_plot_size(21cm, 8cm)
mammals = dataset("MASS", "mammals")
# p1 uses linear axes; p2 plots the same data on log10 axes.
p1 = plot(mammals, x=:Body, y=:Brain, label=:Mammal, Geom.point, Geom.label)
p2 = plot(mammals, x=:Body, y=:Brain, label=:Mammal, Geom.point, Geom.label,
    Scale.x_log10, Scale.y_log10)
hstack(p1, p2)
Out[16]:
In [18]:
using Printf
Diamonds = dataset("ggplot2", "diamonds")
p3 = plot(Diamonds, x=:Price, y=:Carat, Geom.histogram2d(xbincount=25, ybincount=25),
    Scale.x_continuous(format=:engineering))
p4 = plot(Diamonds, x=:Price, y=:Carat, Geom.histogram2d(xbincount=25, ybincount=25),
    Scale.x_continuous(format=:plain),
    Scale.y_sqrt(labels=y->@sprintf("%i", y^2)),
    Scale.color_log10(minvalue=1.0, maxvalue=10^4),
    Guide.yticks(ticks=sqrt.(0:5)))
hstack(p3, p4)
Out[18]:
In [20]:
mtcars = dataset("datasets", "mtcars")
# Maps each cylinder count to the text shown on the x axis of p6.
labeldict = Dict(4=>"four", 6=>"six", 8=>"eight")
p5 = plot(mtcars, x=:Cyl, color=:Cyl, Geom.histogram,
    Scale.x_discrete(levels=[4, 6, 8]), Scale.color_discrete(levels=[4, 6, 8]))
# Same histogram with the level order reversed and text labels on the x axis.
p6 = plot(mtcars, x=:Cyl, color=:Cyl, Geom.histogram,
    Scale.x_discrete(labels=i->labeldict[i], levels=[8, 6, 4]),
    Scale.color_discrete(levels=[8, 6, 4]))
hstack(p5, p6)
Out[20]:
In [24]:
x, y = 0.55*rand(4), 0.55*rand(4)
plot(Coord.cartesian(xmin=0, ymin=0, xmax=1.0, ymax=1.0),
    layer(x=x, y=y, shape=["A"], alpha=["day", "day", "day", "night"]),
    layer(x=1.0 .- y[1:3], y=1.0 .- x[1:3], shape=["B", "C", "C"], alpha=["night"]),
    Scale.shape_discrete(levels=["A", "B", "C"]),
    Scale.alpha_discrete(levels=["day", "night"]),
    Theme(discrete_highlight_color=identity, point_size=12pt,
        point_shapes=[Shape.circle, Shape.star1, Shape.star2], alphas=[0, 1.0],
        default_color="midnightblue")
)
Out[24]:
In [25]:
gasoline = dataset("Ecdat", "Gasoline")
plot(gasoline, x=:Year, y=:LGasPCar, color=:Country, Geom.point, Geom.line)
Out[25]:
In [26]:
fig1a = plot(iris, x=:SepalLength, y=:SepalWidth, Geom.point)
fig1b = plot(iris, x=:SepalWidth, Geom.bar)
fig1 = hstack(fig1a, fig1b)
Out[26]:
Gadfly is based on the Grammar of Graphics.